
Replace post meta sync storage with dedicated sync_updates table#11068

Open
josephfusco wants to merge 24 commits into WordPress:trunk from josephfusco:feature/sync-updates-table

Conversation

@josephfusco

The real-time collaboration sync layer currently stores messages as post meta, which works but creates side effects at scale. This moves it to a dedicated wp_sync_updates table purpose-built for the workload.

The beta1 implementation stores sync messages as post meta on a private wp_sync_storage post type. Post meta is designed for static key-value data, not high-frequency transient message passing. This mismatch causes:

  1. Cache thrashing — Every sync write triggers wp_cache_set_posts_last_changed(), invalidating site-wide post query caches unrelated to collaboration.

  2. Compaction race condition — The "delete all, re-add some" pattern in remove_updates_before_cursor() loses messages under concurrent writes. The window between delete_post_meta() and the add_post_meta() loop is unprotected.

  3. Cursor race condition — Timestamp-based cursors (microtime() * 1000) miss updates when two writes land within the same millisecond.

A purpose-built table with auto-increment IDs eliminates all three at the root: no post meta hooks fire, compaction is a single atomic DELETE, and auto-increment IDs guarantee unique ordering. The WP_Sync_Storage interface and WP_HTTP_Polling_Sync_Server are unchanged.
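As a rough illustration only (the actual schema, column names, and index choices in this PR may differ), a dedicated table with auto-increment IDs and a single-statement compaction could look like:

```php
global $wpdb;

// Hypothetical sketch of the dedicated table: the auto-increment id
// provides a unique, monotonic cursor, and an index on created_at
// supports time-based cleanup.
$wpdb->query(
	"CREATE TABLE {$wpdb->prefix}sync_updates (
		id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
		room VARCHAR(255) NOT NULL,
		message LONGTEXT NOT NULL,
		created_at DATETIME NOT NULL,
		PRIMARY KEY (id),
		KEY room (room(191)),
		KEY created_at (created_at)
	)"
);

// Compaction becomes one atomic statement: there is no unprotected
// window between a delete and a re-add loop.
$wpdb->query(
	$wpdb->prepare(
		"DELETE FROM {$wpdb->prefix}sync_updates WHERE room = %s AND id <= %d",
		$room,
		$cursor
	)
);
```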

Also adds a wp_sync_storage filter so hosts can substitute alternative backends (Redis, WebSocket) without patching core, and includes a beta1 upgrade path that cleans up orphaned wp_sync_storage posts.
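For example (the wp_sync_storage filter name and WP_Sync_Storage interface come from this PR; the backend class here is invented for illustration), a host could swap in an alternative backend like so:

```php
// Hypothetical: substitute a custom WP_Sync_Storage implementation
// (e.g. Redis-backed) without patching core.
add_filter(
	'wp_sync_storage',
	function ( $storage ) {
		if ( class_exists( 'My_Redis_Sync_Storage' ) ) {
			// My_Redis_Sync_Storage would implement WP_Sync_Storage.
			return new My_Redis_Sync_Storage();
		}
		return $storage;
	}
);
```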

Credits

Trac ticket: https://core.trac.wordpress.org/ticket/64696

Use of AI Tools

Co-authored with Claude Code (Opus 4.6), used to synthesize discussion across related tickets and PRs into a single implementation. All code was reviewed and tested before submission.


This Pull Request is for code review only. Please keep all other discussion in the Trac ticket. Do not merge this Pull Request. See GitHub Pull Requests for Code Review in the Core Handbook for more details.

@github-actions

github-actions bot commented Feb 26, 2026

The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the props-bot label.

Core Committers: Use this line as a base for the props when committing in SVN:

Props joefusco, peterwilsoncc, mindctrl, westonruter, paulkevan, dd32.

To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook.

@github-actions

Test using WordPress Playground

The changes in this pull request can be previewed and tested using a WordPress Playground instance.

WordPress Playground is an experimental project that creates a full WordPress instance entirely within the browser.

Some things to be aware of

  • All changes will be lost when closing a tab with a Playground instance.
  • All changes will be lost when refreshing the page.
  • A fresh instance is created each time the link below is clicked.
  • Every time this pull request is updated, a new ZIP file containing all changes is created. If changes are not reflected in the Playground instance,
    it's possible that the most recent build failed, or has not completed. Check the list of workflow runs to be sure.

For more details about these limitations and more, check out the Limitations page in the WordPress Playground documentation.

Test this pull request with WordPress Playground.

@josephfusco josephfusco changed the title feat: replace post meta sync storage with dedicated sync_updates table Replace post meta sync storage with dedicated sync_updates table Feb 26, 2026
Contributor

@peterwilsoncc peterwilsoncc left a comment


First pass note inline.

This approach had occurred to me and I think it's probably quite a good idea.

But, as always, there is a but.

I know @m has been reluctant in the past to add new tables to the database schema. I guess (and Matt will be able to correct me) that's to avoid ending up with a database schema of dozens and dozens of tables.

Not all sites run the database upgrade routine on a regular basis (I think wordpress.org is often behind), as it can be quite a burden on a site. If this approach is to be taken, it would need to include a check for the table's existence in a couple of locations:

  • The option_{$option} hook
  • Before displaying the options field on the writing page

Some form of filtering would need to be available for sites that are using custom data stores.

Sorry to be that person. I'm not trying to be negative, just share knowledge I've learnt over the last few years.

@josephfusco
Author

@peterwilsoncc Thanks for the thorough review — please don't apologize, this is exactly the kind of institutional knowledge that's invaluable.

Not all sites run the database upgrade routine on a regular basis (I think wordpress.org is often behind), as it can be quite a burden on a site. If this approach is to be taken, it would need to include a check for the table's existence in a couple of locations:

Completely agree. I believe I've audited every code path that reads wp_enable_real_time_collaboration to map out what actually needs protection.

  • The option_{$option} hook

When the setting is saved, the flow through options.php → update_option() writes to the wp_options table. No code is hooked onto that option change that would interact with the sync_updates table, so toggling the setting is safe regardless of table state.

  • Before displaying the options field on the writing page

The checkbox in options-writing.php calls get_option() to check/uncheck itself — it doesn't query the sync_updates table, so it renders safely whether or not the upgrade has run.

The code path that touches the table is the REST route registration in rest-api.php:433, which instantiates WP_Sync_Table_Storage. That's already gated with a db_version >= 61697 check (c13f1cd), so the table is never accessed on sites that haven't run the upgrade.

Some form of filtering would need to be available for sites that are using custom data stores.

The wp_sync_storage filter (line 443) should cover this — it lets sites swap out WP_Sync_Table_Storage for any custom WP_Sync_Storage implementation (Redis, etc.).

Let me know if I've missed anything or if you think additional safeguards would be worthwhile!
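A minimal sketch of the kind of gate described above (the helper name is invented here, and the exact db_version threshold may change as the PR evolves):

```php
/**
 * Hypothetical helper: whether the real-time collaboration feature
 * is available on this site.
 *
 * Guards every code path that touches the sync_updates table, so sites
 * that have not yet run the database upgrade never query a missing table.
 */
function wp_is_collaboration_available() {
	return (int) get_option( 'db_version' ) >= 61697
		&& get_option( 'wp_enable_real_time_collaboration' );
}
```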

@josephfusco josephfusco force-pushed the feature/sync-updates-table branch 2 times, most recently from 0104993 to c13f1cd on February 27, 2026 16:40
@westonruter
Member

@peterwilsoncc:

Not all sites run the database upgrade routine on a regular basis (I think wordpress.org is often behind), as it can be quite a burden on a site. If this approach is to be taken, it would need to include a check for the table's existence in a couple of locations:

  • The option_{$option} hook
  • Before displaying the options field on the writing page

Some form of filtering would need to be available for sites that are using custom data stores.

Yeah, this has been an issue, but I think it's primarily/only really an issue for database tables that are used for frontend features. Given that the proposed wp_sync_updates table here is exclusively used in the post editor which is in the admin, and access to the admin requires first doing a DB upgrade screen, then I think this wouldn't be a concern.

I know Matt has been reluctant in the past to add new tables to the database schema. I guess (and Matt will be able to correct me) that's to avoid ending up with a database scheme of dozens and dozens of tables.

I haven't touched custom tables for a long time, so I'm not aware of the state of the art here. But I do know that WooCommerce made the switch to using custom tables instead of postmeta for the sake of scalability, simplicity, and reliability. Doing a quick search in WPDirectory, I see that Jetpack, Yoast, Elementor, WPForms, WordFence, and many others use custom tables as well.


@josephfusco josephfusco force-pushed the feature/sync-updates-table branch from 355e7f2 to 7f57c63 on February 27, 2026 21:53
@peterwilsoncc
Contributor

peterwilsoncc commented Mar 1, 2026

Yeah, this has been an issue, but I think it's primarily/only really an issue for database tables that are used for frontend features. Given that the proposed wp_sync_updates table here is exclusively used in the post editor which is in the admin, and access to the admin requires first doing a DB upgrade screen, then I think this wouldn't be a concern.

This isn't quite correct. On multisite installs, site admins (i.e. non-super admins) don't get prompted to upgrade the database tables, and globally upgrading tables by super admins can be prevented using a filter:

add_filter( 'map_meta_cap', function( $caps, $cap ) {
	if ( 'upgrade_network' === $cap ) {
		$caps[] = 'do_not_allow';
	}
	return $caps;
}, 10, 4 );

To reproduce:

  1. Install MS
  2. Create a sub site
  3. Add a user to the network
  4. Add the new user as an admin on the sub site
  5. Login as sub site admin
  6. Bump DB version in version.php
  7. Visit dashboard as sub site admin
  8. Observe lack of db upgrade prompt
  9. Login as super admin
  10. Observe but don't click db upgrade prompt
  11. You can ignore the prompt as long as you wish as a super admin
  12. Add the code above as a mu-plugin
  13. Prompt disappears
  14. Visit /wp-admin/network/upgrade.php
  15. Observe permission denied message
  16. Remove mu-plugin added above to avoid confusion later.

Keep in mind that the sub site admin can enable site RTC on the writing page, even though they can't update the tables.

I haven't touched custom tables for a long time, so I'm not aware of the state of the art here. But I do know that WooCommerce made the switch to using custom tables instead of postmeta for the sake of scalability, simplicity, and reliability. Doing a quick search in WPDirectory, I see that Jetpack, Yoast, Elementor, WPForms, WordFence, and many others use custom tables as well.

Not really relevant for the core schema.

@westonruter
Member

Not really relevant for the core schema.

Seems relevant to me, as plugins may opt to introduce new tables if the core schema doesn't provide an efficient way to store the desired data. In the same way, if a new feature for core can't be represented efficiently in the core schema, then so too should a new table be considered.

@peterwilsoncc
Contributor

@westonruter Well, no, because my point wasn't about whether the tables would be more performant (I said it's probably a good idea in my comment), my point was that Matt has a reluctance to add more tables to the database schema. Whether that reluctance still remains is worth asking but if it does then the plugin actions aren't relevant to core.

josephfusco and others added 5 commits March 3, 2026 11:22
Co-authored-by: Peter Wilson <519727+peterwilsoncc@users.noreply.github.com>
Adds a db_version check before registering the collaboration REST
routes so sites that haven't run the upgrade routine yet don't hit
a fatal error from the missing sync_updates table.
Schedule a daily cron job to delete sync update rows older than
1 day, preventing unbounded table growth from abandoned
collaborative editing sessions. The expiration is filterable
via the wp_sync_updates_expiration hook.
Store the room identifier string directly instead of its md5 hash.
Room strings like "postType/post:42" are short, already validated by
the REST API schema, and storing them verbatim improves debuggability.
josephfusco and others added 12 commits March 3, 2026 11:22
Hardcode WEEK_IN_SECONDS for cron cleanup, matching the auto-draft
cleanup precedent in core. Remove the wp_sync_updates_expiration filter
to keep the API surface minimal for v1.
The room string is already constrained by REST schema regex
validation, but esc_html() future-proofs against the regex
loosening or the message being reused in an HTML context.
Cover gaps in sync_updates test coverage:

- Cron deletes rows older than 7 days
- Cron preserves rows within the 7-day window
- Cron boundary behavior at exactly 7 days
- Cron selective deletion with mixed old and recent rows
- Cleanup hook registered in default-filters.php
- Sync routes not registered when db_version is below 61698
Co-authored-by: Weston Ruter <westonruter@gmail.com>
Co-authored-by: Weston Ruter <westonruter@gmail.com>
Extract the db_version >= 61698 and option check into a named
function, following the precedent set by
wp_check_term_meta_support_prefilter() in taxonomy.php.
Use the new helper at all remaining call sites: the script injection
in collaboration.php, the cron cleanup guard, the REST route
registration in rest-api.php, and the cron scheduling in admin.php.

The cron handler now also checks the option, which is correct: if
collaboration is disabled, cleanup should be skipped too.
Add Playwright tests covering presence awareness, real-time sync,
and undo/redo for the collaboration feature, along with shared
test fixtures and utilities.
Adds missing @since tags to file headers, fixes the orphaned
set_awareness_state docblock in WP_Sync_Table_Storage, adds @var
to the $storage property, adds interface-level @since, and adds
JSDoc file headers to all E2E test and fixture files.
Extract repeated patterns across collaboration test files into
CollaborationUtils methods: createCollaborativePost,
insertBlockViaEvaluate, assertEditorHasContent, and
assertAllEditorsHaveContent. Replace hardcoded timeout values
with the exported SYNC_TIMEOUT constant. Fix docblock for
get_updates_after_cursor in the WP_Sync_Storage interface.
@josephfusco josephfusco force-pushed the feature/sync-updates-table branch from efaeac8 to 255cde9 on March 3, 2026 16:24

global $wpdb;

$wpdb->query(

Given that this cleanup routine is running via cron, I wonder if it should be more resilient to large amounts of data removal. Imagine a scenario of 1,000 updates being removed in one query: we should perhaps batch process this in reasonable amounts, or grab the correct data from a SELECT query first.

Member


I think this would definitely be a concern in the case of deleting 1,000 posts or postmeta, given that actions are expected to run for each deletion. But a SQL DELETE like this should be very fast. For prior art, the delete_expired_transients() function does a similar DELETE query and it also lacks any LIMIT or batching. And this is even with option_value not being indexed. To ensure the deletion of wp_sync_updates is as fast or faster, an index could be added to created_at: https://github.com/WordPress/wordpress-develop/pull/11068/changes#r2879822934


Can we assume updates that are a week+ old are safe to delete? I asked Gemini this question and it said:

The wp_delete_old_sync_updates function unconditionally deletes updates older than 7 days.

  • Impact: If a user does not sync for > 7 days, or if the latest "compaction" (snapshot) update is > 7 days old, the server will delete the history they rely on. When the client reconnects, the server will return "no updates" (or a partial list starting after the gap). The client will assume it is up-to-date or apply updates on top of an inconsistent state, leading to document corruption.
  • Context: The server does not signal "History Lost" or force a resync. The removal of history is safer than the old infinite-growth model, but without a protocol to detect "too old" cursors (e.g., checking if requested_cursor < min_table_id), this causes silent data corruption for offline users.

Author

@josephfusco josephfusco Mar 3, 2026


@pkevan @westonruter Agreed - a single DELETE on the indexed created_at column should be fast even at volume, same as delete_expired_transients() handles it today without batching.

@mindctrl The Gemini analysis assumes a client could hold a stale cursor across the 7-day window, but the cursor is in-memory and resets to 0 on every editor load - a returning user always starts fresh.

The behavior here was modeled after wp_delete_auto_drafts(): daily cron, hardcoded 7-day threshold, non-filterable. Auto-drafts are orphaned posts from abandoned editing sessions; sync updates are orphaned rows from abandoned collaboration sessions. Compaction keeps row volume low during active use, so the cleanup is a safety net for abandoned sessions, not a routine data path.

public function set_awareness_state( string $room, array $awareness ): bool {
// Awareness is high-frequency, short-lived data (cursor positions, selections)
// that doesn't need cursor-based history. Transients avoid row churn in the table.
return set_transient( $this->get_awareness_transient_key( $room ), $awareness, MINUTE_IN_SECONDS );

Will using transients that expire in 1 minute be somewhat wasteful resource-wise? Could this not just be implemented in the custom table with an appropriate cleanup routine?

Author

@josephfusco josephfusco Mar 3, 2026


Good question. Awareness is last-write-wins - only the latest state per room matters, no history. Transients are appropriate because they self-expire via the 1-minute TTL (handles presence disappearing when a user leaves) and on sites with an external object cache, set_transient() routes through wp_cache_set() so these never hit the database.

Moving it into the custom table would add more overhead - an INSERT per poll per user per room during active editing, plus another cleanup routine for stale awareness rows. The transient approach avoids that churn and keeps the table focused on ordered sync updates.
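Under those assumptions, the read side would simply mirror the write shown above (the method name here follows the set_awareness_state naming in this PR, but the body is an illustrative sketch, not the actual implementation):

```php
// Hypothetical read-side counterpart: an expired transient returns
// false, which maps naturally to "nobody present in this room".
public function get_awareness_state( string $room ): array {
	$state = get_transient( $this->get_awareness_transient_key( $room ) );
	return is_array( $state ) ? $state : array();
}
```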

public function get_updates_after_cursor( string $room, int $cursor ): array {
global $wpdb;

// Snapshot the current max ID for this room to define a stable upper bound.

It would be useful to see some analysis of how these new queries perform with large data sets, to see whether any optimizations are possible or needed.


josephfusco and others added 4 commits March 3, 2026 14:18
Co-authored-by: Weston Ruter <westonruter@gmail.com>
Replace the opaque JSON blob with dedicated columns so the sync_updates
table is legible and queryable without JSON parsing. Removes the
json_encode/json_decode round-trip on every write and read.

No migration needed — the table has not shipped in a stable release.

Props mindctrl.
@peterwilsoncc
Contributor

@westonruter @josephfusco As mentioned in Slack and on the ticket, Matt has approved an additional table for RTC.

As this is a first design and wasn't discussed architecturally, I would like to slow right down and consider the architecture on the ticket prior to proceeding here. As you'll have seen, I've posted in Slack, but will comment on the ticket once I've thought about things further.

Prior to the discussion, I think it's best if this PR is put on hold so we can focus on other RTC issues in the meantime. There are a few things in the JavaScript client and the polling class that need work.

josephfusco and others added 2 commits March 3, 2026 16:19
Co-authored-by: Weston Ruter <westonruter@gmail.com>
…ble.

Reverts the column split from c920b97 and removes the round-trip test
from ceccbb4. The sync_updates table is an append-only event log and
should remain protocol-agnostic — the storage layer stores and orders;
the sync server interprets.

This reverts commit c920b97.
This reverts commit ceccbb4.